Cowley County
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Ormazabal, Aitor, Artetxe, Mikel, Agirre, Eneko
Methods for adapting language models (LMs) to new tasks and domains have traditionally assumed white-box access to the model and work by modifying its parameters. However, this is incompatible with a recent trend in the field, where the highest-quality models are only available as black boxes through inference APIs. Even when the model weights are available, the computational cost of fine-tuning large LMs can be prohibitive for most practitioners. In this work, we present a lightweight method for adapting large LMs to new domains and tasks, assuming no access to their weights or intermediate activations. Our approach fine-tunes a small white-box LM and combines it with the large black-box LM at the probability level through a small network, learned on a small validation set. We validate our approach by adapting a large LM (OPT-30B) to several domains and a downstream task (machine translation), observing improved performance in all cases, of up to 9%, while using a domain expert 23x smaller.
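The probability-level combination described in the abstract can be illustrated with a deliberately simplified sketch. It assumes a single scalar mixing weight in place of the paper's small network, and a grid search in place of gradient-based training. The names `combine`, `nll`, `fit_lambda`, and the toy validation set `VAL` are hypothetical and do not come from the paper.

```python
import math


def combine(p_small, p_large, lam):
    """Mix two next-token distributions over the same vocabulary.

    lam weights the small fine-tuned LM; (1 - lam) weights the black-box LM.
    """
    return [lam * s + (1.0 - lam) * l for s, l in zip(p_small, p_large)]


def nll(examples, lam):
    """Mean negative log-likelihood of the gold tokens under the mixture."""
    total = 0.0
    for p_small, p_large, gold in examples:
        total -= math.log(combine(p_small, p_large, lam)[gold])
    return total / len(examples)


def fit_lambda(examples, steps=101):
    """Pick the mixing weight that minimizes validation NLL (grid search)."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda lam: nll(examples, lam))


# Toy validation set (made up for illustration):
# (small-LM probabilities, large-LM probabilities, gold token index)
VAL = [
    ([0.7, 0.2, 0.1], [0.3, 0.4, 0.3], 0),
    ([0.6, 0.3, 0.1], [0.2, 0.5, 0.3], 0),
    ([0.1, 0.2, 0.7], [0.2, 0.6, 0.2], 1),
]

best_lam = fit_lambda(VAL)
mixed = combine(VAL[0][0], VAL[0][1], best_lam)
```

Only output probabilities from the large model are needed here, which matches the black-box setting: no weights or activations are touched. A learned network that predicts the weight per token would replace `fit_lambda` in a fuller version.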
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Assessments of epistemic uncertainty using Gaussian stochastic weight averaging for fluid-flow regression
Morimoto, Masaki, Fukami, Kai, Maulik, Romit, Vinuesa, Ricardo, Fukagata, Koji
We use Gaussian stochastic weight averaging (SWAG) to assess the model-form uncertainty associated with neural-network-based function approximation relevant to fluid flows. SWAG approximates a posterior Gaussian distribution of each weight, given training data and a constant learning rate. Having access to this distribution, it is able to create multiple models with various combinations of sampled weights, which can be used to obtain ensemble predictions. The average of such an ensemble can be regarded as the 'mean estimation', whereas its standard deviation can be used to construct 'confidence intervals', which enable us to perform uncertainty quantification (UQ) with regard to the training process of neural networks. We utilize representative neural-network-based function approximation tasks for the following cases: (i) a two-dimensional circular-cylinder wake; (ii) the DayMET dataset (maximum daily temperature in North America); (iii) a three-dimensional square-cylinder wake; and (iv) urban flow, to assess the generalizability of the present idea for a wide range of complex datasets. SWAG-based UQ can be applied regardless of the network architecture, and therefore, we demonstrate the applicability of the method for two types of neural networks: (i) global field reconstruction from sparse sensors by combining a convolutional neural network (CNN) and a multi-layer perceptron (MLP); and (ii) far-field state estimation from sectional data with a two-dimensional CNN. We find that SWAG can obtain physically interpretable confidence-interval estimates from the perspective of model-form uncertainty. This capability supports its use for a wide range of problems in science and engineering.
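The sampling-and-ensembling procedure in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a diagonal (per-weight) Gaussian fitted to weight snapshots, a stand-in linear model instead of a CNN or MLP, and a two-standard-deviation band as the interval. All names (`swag_moments`, `sample_weights`, `predict`, `ensemble_predict`, `SNAPSHOTS`) and the snapshot values are hypothetical.

```python
import math
import random
import statistics


def swag_moments(snapshots):
    """Per-weight mean and variance from weight snapshots (diagonal Gaussian)."""
    n = len(snapshots)
    dim = len(snapshots[0])
    mean = [sum(s[i] for s in snapshots) / n for i in range(dim)]
    sq_mean = [sum(s[i] ** 2 for s in snapshots) / n for i in range(dim)]
    # Clamp at zero to guard against tiny negative values from rounding.
    var = [max(sq - m ** 2, 0.0) for sq, m in zip(sq_mean, mean)]
    return mean, var


def sample_weights(mean, var, rng):
    """Draw one weight vector from the fitted per-weight Gaussians."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]


def predict(weights, x):
    """Stand-in model: a line y = w0 + w1 * x."""
    return weights[0] + weights[1] * x


def ensemble_predict(mean, var, x, n_models=200, seed=0):
    """Ensemble mean and standard deviation over sampled models."""
    rng = random.Random(seed)
    preds = [predict(sample_weights(mean, var, rng), x) for _ in range(n_models)]
    return statistics.mean(preds), statistics.stdev(preds)


# Hypothetical weight snapshots collected along a constant-learning-rate run.
SNAPSHOTS = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.05, 2.0]]

w_mean, w_var = swag_moments(SNAPSHOTS)
mu, sigma = ensemble_predict(w_mean, w_var, x=3.0)
interval = (mu - 2 * sigma, mu + 2 * sigma)
```

Because only the weight statistics and a forward pass are involved, the same loop applies to any architecture, which is the property the abstract highlights.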
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Kansas > Cowley County (0.25)
- Energy > Oil & Gas > Upstream (0.68)
- Government > Regional Government > North America Government > United States Government (0.46)
Soil Erosion in the United States. Present and Future (2020-2050)
Shojaeezadeh, Shahab Aldin, Al-Wardy, Malik, Nikoo, Mohammad Reza, Mooselu, Mehrdad Ghorbani, Alizadeh, Mohammad Reza, Adamowski, Jan Franklin, Moradkhani, Hamid, Alamdari, Nasrin, Gandomi, Amir H.
Soil erosion is a significant threat to the environment and long-term land management around the world. Accelerated soil erosion by human activities inflicts extreme changes in terrestrial and aquatic ecosystems, which have not been fully surveyed or predicted for the present and probable future at field scale (30 m). Here, we estimate and predict soil erosion rates by water erosion (sheet and rill erosion), using three alternative (2.6, 4.5, and 8.5) Shared Socioeconomic Pathway and Representative Concentration Pathway (SSP-RCP) scenarios across the contiguous United States. Field Scale Soil Erosion Model (FSSLM) estimations rely on a high-resolution (30 m) G2 erosion model integrated with satellite- and imagery-based estimations of land use and land cover (LULC), gauge observations of long-term precipitation, and scenarios of the Coupled Model Intercomparison Project Phase 6 (CMIP6). The baseline model (2020) estimates soil erosion rates of 2.32 Mg ha⁻¹ yr⁻¹ with current agricultural conservation practices (CPs). Future scenarios with current CPs indicate an increase of between 8% and 21% under different combinations of SSP-RCP scenarios of climate and LULC changes. The soil erosion forecast for 2050 suggests that all the climate and LULC scenarios indicate either an increase in extreme events or a change in the spatial location of extremes, largely from the southern to the eastern and northeastern regions of the United States.
- North America > United States > Alabama (0.28)
- North America > United States > Kansas > Cowley County (0.24)
- Government > Regional Government > North America Government > United States Government (1.00)
- Food & Agriculture > Agriculture (1.00)
- Energy > Oil & Gas > Upstream (0.68)
Artificial Intelligence Is Now Far Too Big To Be Limited To Computer Science - 7wData
In a recent post, I explored current-day drivers of AI, which include gamers, hipsters and angels, who have applied the technology in highly unexpected ways. So, already, there are new forces at work bringing AI into the mainstream of business and society. In addition, C-level executives have increasing responsibility for the actions and behaviors of their AI-driven systems. The academic community recognizes that AI has left the computer science building, metaphorically speaking. "Currently, the scientists who most commonly study the behaviour of machines are the computer scientists, roboticists and engineers who have created the machines in the first place. These scientists may be expert mathematicians and engineers; however, they are typically not trained behaviorists," writes Iyad Rahwan, who leads the Scalable Cooperation group at the MIT Media Lab, along with a wide team of computer and data scientists from leading universities.
Artificial Intelligence Is Now Far Too Big To Be Limited To Computer Science
While all systems reflect the intentions and biases of their human creators, artificial intelligence takes things in a new direction, connecting inferences in unexpected, and often, unexplainable ways. That's why some leading voices in the academic and computer science community are calling for a new way of studying and understanding AI actions, which they call "machine behavior." In a recent post, I explored current-day drivers of AI, which include gamers, hipsters and angels, who have applied the technology in highly unexpected ways. So, already, there are new forces at work bringing AI into the mainstream of business and society. In addition, C-level executives have increasing responsibility for the actions and behaviors of their AI-driven systems.